
[ALF] Add index adjustment for UTF-8 indices #8056

Open
emilypgoogle wants to merge 5 commits into main from ep/citation-utf-16



@emilypgoogle emilypgoogle commented Apr 21, 2026

The AI Logic endpoints return citation indices based on UTF-8 bytes, but Java and Kotlin use UTF-16 natively. This means that the provided indices are often offset from actual content and can even point out of bounds, making them very difficult to use.
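
To illustrate the mismatch described above, here is a minimal Java sketch of mapping a UTF-8 byte offset to a UTF-16 index. The names `Utf8IndexSketch` and `utf8ToUtf16Index` are hypothetical and this is not the PR's implementation. The sketch assumes that offsets past the end clamp to the string length and that offsets falling inside a multi-byte character round up to the next character boundary.

```java
final class Utf8IndexSketch {
    private Utf8IndexSketch() {}

    /**
     * Hypothetical helper: maps a UTF-8 byte offset into `text` to the
     * corresponding UTF-16 (Java char) index.
     * Assumption: out-of-range offsets clamp to text.length().
     */
    static int utf8ToUtf16Index(String text, int utf8Index) {
        int bytes = 0; // UTF-8 bytes consumed so far
        int i = 0;     // UTF-16 index reached so far
        while (i < text.length() && bytes < utf8Index) {
            int cp = text.codePointAt(i);
            // UTF-8 width of this code point: 1 to 4 bytes.
            bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            // UTF-16 width: 1 char, or 2 for a surrogate pair.
            i += Character.charCount(cp);
        }
        return i;
    }
}
```

For the string `"25\u00b0C"` (25°C), the byte offset 4 points just past the degree symbol, which is UTF-16 index 3; the string is 5 bytes long in UTF-8 but only 4 chars long in UTF-16.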

This applies to both citation metadata and grounding metadata.

Tests were added to all validation to ensure that grounding indices match the provided text exactly; all existing tests currently pass. Further tests force grounding with citations using strings whose lengths differ between UTF-8 and UTF-16 (the degree symbol and accented letters are multi-byte characters in UTF-8).
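
As a rough illustration of why such strings make useful test data, the Java sketch below compares UTF-8 and UTF-16 lengths and builds a reference slice by decoding the UTF-8 bytes directly. `EncodingLengthSketch` and its methods are hypothetical names, and the sample string is an assumption rather than a fixture taken from the PR.

```java
import java.nio.charset.StandardCharsets;

final class EncodingLengthSketch {
    private EncodingLengthSketch() {}

    /** Number of bytes `s` occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Reference slice: interprets [startByte, endByte) as UTF-8 byte offsets.
     * Assumes both offsets fall on character boundaries.
     */
    static String sliceByUtf8(String text, int startByte, int endByte) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, startByte, endByte - startByte, StandardCharsets.UTF_8);
    }
}
```

With the sample `"Caf\u00e9 at 20\u00b0C"` (Café at 20°C), the UTF-8 length is 14 while `length()` is 12. The byte range 9 to 14 covers "20°C", but using 9 directly as a `substring` start yields "0°C", and using 14 as the end would be out of bounds.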

Manual testing confirmed that the indices are accurate in practice. Exhaustive testing is difficult because citations are generally rarer, whereas grounding is easy to force and test.

@gemini-code-assist
Contributor

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

@github-actions
Contributor

github-actions Bot commented Apr 21, 2026

📝 PRs merging into main branch

Our main branch should always be in a releasable state. If you are working on a larger change, or if you don't want this change to see the light of day just yet, consider using a feature branch first, and only merge into the main branch when the code is complete and ready to be released.

@google-oss-bot
Collaborator

1 Warning
⚠️ Did you forget to add a changelog entry? (Add the 'no-changelog' label to the PR to silence this warning.)

Generated by 🚫 Danger

@emilypgoogle
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements UTF-8 to UTF-16 index conversion for citations and grounding metadata, ensuring accurate text segment mapping in the public API. Key changes include the introduction of a convertUtf8IndexToUtf16 utility, updates to the Candidate model hierarchy, and the addition of comprehensive unit and instrumentation tests. Feedback highlights a potential out-of-bounds error when handling malformed surrogate pairs, performance inefficiencies due to redundant content scans, and compatibility issues with android.util.Log in non-Android test environments.
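
One way the surrogate-pair and redundant-scan concerns might be addressed is sketched below in Java: several sorted byte offsets are converted in a single pass, and a high surrogate is treated as part of a pair only after a bounds check on the following char. This is an illustration under stated assumptions, not the PR's `convertUtf8IndexToUtf16`. The name `SinglePassIndexSketch` is hypothetical, the input offsets are assumed to be sorted in ascending order, and an unpaired surrogate is assumed to count as 3 bytes.

```java
final class SinglePassIndexSketch {
    private SinglePassIndexSketch() {}

    /**
     * Converts ascending UTF-8 byte offsets to UTF-16 indices in one scan.
     * Assumption: offsets past the end clamp to text.length().
     */
    static int[] utf8ToUtf16Indices(String text, int[] sortedUtf8Indices) {
        int[] result = new int[sortedUtf8Indices.length];
        int bytes = 0; // UTF-8 bytes consumed so far
        int i = 0;     // UTF-16 index reached so far
        for (int k = 0; k < sortedUtf8Indices.length; k++) {
            int target = sortedUtf8Indices[k];
            // The scan resumes where the previous offset stopped.
            while (i < text.length() && bytes < target) {
                char c = text.charAt(i);
                // Bounds check before reading the low surrogate.
                boolean pair = Character.isHighSurrogate(c)
                        && i + 1 < text.length()
                        && Character.isLowSurrogate(text.charAt(i + 1));
                if (pair) {
                    bytes += 4;
                    i += 2;
                } else {
                    // Unpaired surrogates land in the 3-byte branch (assumption).
                    bytes += c < 0x80 ? 1 : c < 0x800 ? 2 : 3;
                    i += 1;
                }
            }
            result[k] = i;
        }
        return result;
    }
}
```

With a string ending in a lone high surrogate, such as `"a\uD83D"`, the scan stops at the string length instead of reading past it.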

Comment thread ai-logic/firebase-ai/src/main/kotlin/com/google/firebase/ai/type/Candidate.kt Outdated
Comment thread ai-logic/firebase-ai/src/main/kotlin/com/google/firebase/ai/type/Candidate.kt Outdated
@emilypgoogle emilypgoogle marked this pull request as ready for review May 5, 2026 21:40